-
Notifications
You must be signed in to change notification settings - Fork 533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade to google.golang.org/grpc v1.66.2 / modify certain protobuf messages to retain their unmarshaling buffer #9401
Conversation
06c9166
to
c6b77ea
Compare
Some tests are breaking for this upgrade, have to fix them. |
7ed1789
to
b61cbbd
Compare
Edit: Memory unsafe optimizations are back again :) Solved by letting relevant protobuf messages keep an unmarshalling buffer reference. |
d79ea9f
to
f484e3a
Compare
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
ba307e7
to
35b8b14
Compare
Signed-off-by: Arve Knudsen <[email protected]>
35b8b14
to
7251bb0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you for addressing all my comments.
Follow up on #9401 While the write path has been carefully crafted to make sure that we don't keep references to the strings backed by arrays with data from grpc stream (because it would cause a memory leak), this doesn't seem to be the case in the query path, where some tests started failing after we started to reuse those backing arrays through the usage of memory buffers implemented in the new gRPC library. This change reverts the buffer freeing, means that it won't be recycled (and will be garbage collected) to ensure data correctness, while we investigate where the data references are kept. Signed-off-by: Oleg Zaytsev <[email protected]>
Follow up on #9401 While the write path has been carefully crafted to make sure that we don't keep references to the strings backed by arrays with data from grpc stream (because it would cause a memory leak), this doesn't seem to be the case in the query path, where some tests started failing after we started to reuse those backing arrays through the usage of memory buffers implemented in the new gRPC library. This change reverts the buffer freeing, means that it won't be recycled (and will be garbage collected) to ensure data correctness, while we investigate where the data references are kept. Signed-off-by: Oleg Zaytsev <[email protected]>
…otobuf messages to retain their unmarshaling buffer (#9401) Signed-off-by: Yuri Nikolic <[email protected]>
* Revert "Don't free buffers after reading query stream (#9721)" This reverts commit f7b6017. * Revert: Upgrade to google.golang.org/grpc v1.66.2 / modify certain protobuf messages to retain their unmarshaling buffer (#9401) Signed-off-by: Yuri Nikolic <[email protected]> * Revert "Distributor.queryIngesterStream: Free gRPC buffers upon error (#9756)" This reverts commit eda1a4b. --------- Signed-off-by: Yuri Nikolic <[email protected]>
* [release-2.14] fix(deps): update github.com/thanos-io/objstore digest to f90c89a (main) (#9625) * Update github.com/thanos-io/objstore digest to f90c89a (#9534) Signed-off-by: Arve Knudsen <[email protected]> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> (cherry picked from commit 3c97a61) * update changelog Signed-off-by: Vladimir Varankin <[email protected]> --------- Signed-off-by: Vladimir Varankin <[email protected]> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: Vladimir Varankin <[email protected]> * 2.14.1 Prepare patch release (#9796) * bump patch version Signed-off-by: Vladimir Varankin <[email protected]> * rebuild assets Signed-off-by: Vladimir Varankin <[email protected]> --------- Signed-off-by: Vladimir Varankin <[email protected]> * MQE: Fix handling of string results (#9803) This would previously panic after the query was closed * chore(deps): update grafana/mimirtool docker tag to v2.14.1 (#9806) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> * Revert upgrade to google.golang.org/grpc v1.66.2 (#9811) * Revert "Don't free buffers after reading query stream (#9721)" This reverts commit f7b6017. * Revert: Upgrade to google.golang.org/grpc v1.66.2 / modify certain protobuf messages to retain their unmarshaling buffer (#9401) Signed-off-by: Yuri Nikolic <[email protected]> * Revert "Distributor.queryIngesterStream: Free gRPC buffers upon error (#9756)" This reverts commit eda1a4b. --------- Signed-off-by: Yuri Nikolic <[email protected]> * Improve lock contention affecting read and write latencies during TSDB head compaction (cherry-pick Prometheus PR 15242) (#9822) * Cherry-pick Prometheus PR 15242 Signed-off-by: Marco Pracucci <[email protected]> * Added CHANGELOG entry Signed-off-by: Marco Pracucci <[email protected]> * Updated CHANGELOG Signed-off-by: Marco Pracucci <[email protected]> --------- Signed-off-by: Marco Pracucci <[email protected]> --------- Signed-off-by: Vladimir Varankin <[email protected]> Signed-off-by: Yuri Nikolic <[email protected]> Signed-off-by: Marco Pracucci <[email protected]> Co-authored-by: Grot (@grafanabot) <[email protected]> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: Vladimir Varankin <[email protected]> Co-authored-by: Joshua Hesketh <[email protected]> Co-authored-by: Đurica Yuri Nikolić <[email protected]> Co-authored-by: Marco Pracucci <[email protected]>
What this PR does
Upgrade google.golang.org/grpc from v1.65.0 to v1.66.2. The performance regression referenced from dskit's reversal to v1.65.0 is fixed in v1.66.2, and the potential functional regression also referenced in said reversal is Loki specific.
What we found out is that v1.66 fixes a serious performance bottleneck in conjunction with compression; the
decompress
function is optimized through memory pooling, whereas v1.65.0 would useio.ReadAll
and cause a lot of allocations during decompression. As a result, we saw ingester CPU usage increase by ~40% when enabling gRPC compression. The increase is now much more modest, maybe ~7% based on observations.In order to keep allowing for unsafe references to unmarshalling buffers in e.g.
LabelAdapter
, I had to introduce a custom gRPC unmarshalling hook (mimirpb.CodecV2.Unmarshal
) plus a scheme for protobuf messages to keep a reference to their unmarshalling buffer (mimirpb.UnmarshalerV2
). The underlying reason is that from v1.66 on, gRPC immediately recycles each buffer after unmarshalling, unless a reference is kept. This causes data races with e.g.LabelAdapter
taking unsafe string references to the unmarshal buffer, unless, as mentioned, a buffer reference is kept with the root protobuf message.Ideally, one should call
FreeBuffer
on protobuf messages keeping an unmarshalling buffer reference so the buffer can be given back to the gRPC pool. If this isn't done though, there also shouldn't be a memory leak since the buffer should be garbage collected.Which issue(s) this PR fixes or relates to
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.